Data-Driven Sparse Structure Selection for Deep Neural Networks
Deep convolutional neural networks have demonstrated extraordinary power on
various tasks. However, it is still very challenging to deploy state-of-the-art
models into real-world applications due to their high computational complexity.
How can we design a compact and effective network without massive experiments
and expert knowledge? In this paper, we propose a simple and effective
framework to learn and prune deep models in an end-to-end manner. In our
framework, a new type of parameter, the scaling factor, is first introduced to
scale the outputs of specific structures, such as neurons, groups or residual
blocks. Then we add sparsity regularizations on these factors, and solve this
optimization problem by a modified stochastic Accelerated Proximal Gradient
(APG) method. By forcing some of the factors to zero, we can safely remove the
corresponding structures, thus pruning the unimportant parts of a CNN. Compared
with other structure selection methods that may need thousands of trials or
iterative fine-tuning, our method is trained fully end-to-end in one training
pass without bells and whistles. We evaluate our method, Sparse Structure
Selection, with several state-of-the-art CNNs, and demonstrate very promising
results with adaptive depth and width selection.
Comment: ECCV Camera ready versio
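The pruning idea in this abstract (sparsity-regularized scaling factors driven to exactly zero) can be illustrated with a plain proximal gradient step. This is a minimal sketch, not the authors' modified stochastic APG: it assumes an L1 penalty, omits the acceleration term, and uses made-up factor and gradient values.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |value| (the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def proximal_step(factors, grads, lr, l1_weight):
    """One plain (non-accelerated) proximal gradient step on the scaling factors."""
    return [soft_threshold(f - lr * g, lr * l1_weight)
            for f, g in zip(factors, grads)]


def kept_structures(factors):
    """Indices of structures whose scaling factor is still non-zero."""
    return [i for i, f in enumerate(factors) if f != 0.0]


# Hypothetical scaling factors and loss gradients for four structures.
factors = [0.9, 0.05, -0.4, 0.01]
grads = [0.1, 0.2, -0.1, 0.0]
updated = proximal_step(factors, grads, lr=0.1, l1_weight=0.5)
print(updated)
print(kept_structures(updated))
```

Factors that land inside the threshold band become exactly zero, so the corresponding neurons, groups or blocks can be dropped without a separate pruning pass.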
Food Ingredients Recognition through Multi-label Learning
Automatically constructing a food diary that tracks the ingredients consumed
can help people follow a healthy diet. We tackle the problem of food
ingredients recognition as a multi-label learning problem. We propose a method
for adapting a high-performing state-of-the-art CNN to act as a
multi-label predictor for learning recipes in terms of their list of
ingredients. We show that our model is able to, given a picture, predict its
list of ingredients, even if the recipe corresponding to the picture has never
been seen by the model. We make public two new datasets suitable for this
purpose. Furthermore, we show that a model trained with a high variability of
recipes and ingredients is able to generalize better on new data, and visualize
how it specializes each of its neurons to different ingredients.
Comment: 8 page
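A multi-label predictor of the kind described here is commonly built by giving each label its own sigmoid output and training with binary cross-entropy. The abstract does not state the activation or loss, so both are assumptions in this minimal sketch, and the ingredient names and logits are invented for illustration.

```python
import math


def sigmoid(x):
    """Map a raw score to an independent per-label probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_ingredients(logits, labels, threshold=0.5):
    """Return every label whose probability reaches the threshold."""
    return [label for logit, label in zip(logits, labels)
            if sigmoid(logit) >= threshold]


def binary_cross_entropy(logits, targets):
    """Mean per-label binary cross-entropy (assumed training loss)."""
    total = 0.0
    for logit, target in zip(logits, targets):
        p = sigmoid(logit)
        total -= target * math.log(p) + (1 - target) * math.log(1.0 - p)
    return total / len(logits)


# Hypothetical label set and network outputs for one picture.
labels = ["tomato", "cheese", "basil", "egg"]
logits = [2.0, -1.5, 0.3, -0.2]
print(predict_ingredients(logits, labels))
print(binary_cross_entropy(logits, [1, 0, 1, 0]))
```

Because each label is scored independently, the model can output ingredient combinations that never appeared together in a training recipe.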
MTDeep: Boosting the Security of Deep Neural Nets Against Adversarial Attacks with Moving Target Defense
Present attack methods can make state-of-the-art classification systems based
on deep neural networks misclassify every adversarially modified test example.
The design of general defense strategies against a wide range of such attacks
still remains a challenging problem. In this paper, we draw inspiration from
the fields of cybersecurity and multi-agent systems and propose to leverage the
concept of Moving Target Defense (MTD) in designing a meta-defense for
'boosting' the robustness of an ensemble of deep neural networks (DNNs) for
visual classification tasks against such adversarial attacks. To classify an
input image, a trained network is picked randomly from this set of networks by
formulating the interaction between a Defender (who hosts the classification
networks) and their (Legitimate and Malicious) users as a Bayesian Stackelberg
Game (BSG). We empirically show that this approach, MTDeep, reduces
misclassification on perturbed images in various datasets such as MNIST,
FashionMNIST, and ImageNet while maintaining high classification accuracy on
legitimate test images. We then demonstrate that our framework, being the first
meta-defense technique, can be used in conjunction with any existing defense
mechanism to provide more resilience against adversarial attacks than can be
afforded by these defense mechanisms. Lastly, to quantify the increase in
robustness of an ensemble-based classification system when we use MTDeep, we
analyze the properties of a set of DNNs and introduce the concept of
differential immunity that formalizes the notion of attack transferability.
Comment: Accepted to the Conference on Decision and Game Theory for Security
(GameSec), 201
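The randomized selection step described in this abstract can be sketched as sampling a network from a mixed strategy for every input. Solving the Bayesian Stackelberg Game is not shown: the probabilities below are hypothetical placeholders for its solution, and the "networks" are stand-in callables that only report which one answered.

```python
import random


def pick_network(networks, strategy, rng):
    """Sample one network according to the defender's mixed strategy."""
    return rng.choices(networks, weights=strategy, k=1)[0]


def classify(image, networks, strategy, rng):
    """Classify with a network drawn afresh for every input."""
    return pick_network(networks, strategy, rng)(image)


# Stand-in networks: each returns a tag naming itself instead of a real label.
networks = [lambda image: "net_a", lambda image: "net_b"]
strategy = [0.7, 0.3]  # hypothetical probabilities, not a solved game
rng = random.Random(0)
answers = [classify(None, networks, strategy, rng) for _ in range(1000)]
print(answers.count("net_a"), answers.count("net_b"))
```

An attacker who crafts a perturbation against one network cannot know in advance which network will serve a given query.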
DeeSIL: Deep-Shallow Incremental Learning
Incremental Learning (IL) is an interesting AI problem when the algorithm is
assumed to work on a budget. This is especially true when IL is modeled using a
deep learning approach, where two complex challenges arise due to limited
memory, which induces catastrophic forgetting and delays related to the
retraining needed in order to incorporate new classes. Here we introduce
DeeSIL, an adaptation of a known transfer learning scheme that combines a fixed
deep representation used as a feature extractor with independent shallow
classifiers learned to increase recognition capacity. This scheme tackles the two
aforementioned challenges since it works well with a limited memory budget and
each new concept can be added within a minute. Moreover, since no deep
retraining is needed when the model is incremented, DeeSIL can integrate larger
amounts of initial data that provide more transferable features. Performance is
evaluated on ImageNet LSVRC 2012 against three state-of-the-art algorithms.
Results show that, at scale, DeeSIL performance is 23 and 33 points higher than
the best baseline when using the same and more initial data, respectively.
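The scheme in this abstract (fixed features plus one independent shallow classifier per class) can be sketched as follows. This is an illustration only: the perceptron is a stand-in for whatever shallow classifier the authors use, the 2-d vectors stand in for features already extracted by the frozen deep network, and the class names are invented.

```python
def train_binary_classifier(features, targets, epochs=20, lr=0.1):
    """Perceptron on fixed features; targets are +1 or -1."""
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(features, targets):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if t * score <= 0:  # misclassified: nudge toward the target
                weights = [w + lr * t * xi for w, xi in zip(weights, x)]
                bias += lr * t
    return weights, bias


class IncrementalModel:
    """One independent shallow classifier per class; nothing deep is retrained."""

    def __init__(self):
        self.classifiers = {}

    def add_class(self, name, positives, negatives):
        features = positives + negatives
        targets = [1] * len(positives) + [-1] * len(negatives)
        self.classifiers[name] = train_binary_classifier(features, targets)

    def predict(self, x):
        def score(item):
            weights, bias = item[1]
            return sum(w * xi for w, xi in zip(weights, x)) + bias
        return max(self.classifiers.items(), key=score)[0]


cats = [[1.0, 0.1], [0.9, 0.2]]
dogs = [[0.1, 1.0], [0.2, 0.9]]
birds = [[-1.0, -1.0], [-0.9, -0.8]]
model = IncrementalModel()
model.add_class("cat", cats, dogs)
model.add_class("dog", dogs, cats)
before = dict(model.classifiers)
model.add_class("bird", birds, cats + dogs)  # the increment: one new classifier
print(model.predict([-1.0, -0.9]), model.classifiers["cat"] == before["cat"])
```

Adding a class only trains one new classifier, so previously learned classifiers are left untouched.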
Adversarial attacks hidden in plain sight
Convolutional neural networks have been used to achieve a string of successes
during recent years, but their lack of interpretability remains a serious
issue. Adversarial examples are designed to deliberately fool neural networks
into making any desired incorrect classification, potentially with very high
certainty. Several defensive approaches increase robustness against adversarial
attacks, demanding attacks of greater magnitude, which lead to visible
artifacts. By considering human visual perception, we devise a technique that
hides such adversarial attacks in regions of high complexity, such
that they are imperceptible even to an astute observer. We carry out a user
study on classifying adversarially modified images to validate the perceptual
quality of our approach and find significant evidence for its concealment with
regard to human visual perception.
Recycle-GAN: Unsupervised Video Retargeting
We introduce a data-driven approach for unsupervised video retargeting that
translates content from one domain to another while preserving the style native
to a domain, i.e., if contents of John Oliver's speech were to be transferred
to Stephen Colbert, then the generated content/speech should be in Stephen
Colbert's style. Our approach combines both spatial and temporal information
along with adversarial losses for content translation and style preservation.
In this work, we first study the advantages of using spatiotemporal constraints
over spatial constraints for effective retargeting. We then demonstrate the
proposed approach for the problems where information in both space and time
matters such as face-to-face translation, flower-to-flower, wind and cloud
synthesis, sunrise and sunset.
Comment: ECCV 2018; Please refer to project webpage for videos -
http://www.cs.cmu.edu/~aayushb/Recycle-GA
Hierarchical ResNeXt Models for Breast Cancer Histology Image Classification
Microscopic histology image analysis is a cornerstone in early detection of
breast cancer. However, these images are very large and manual analysis is
error-prone and very time-consuming. Thus automating this process is in high demand.
We propose a hierarchical system of convolutional neural networks (CNN) that
automatically classifies patches of these images into four pathologies: normal,
benign, in situ carcinoma and invasive carcinoma. We evaluated our system on
the BACH challenge dataset of image-wise classification and a small dataset
that we used to extend it. Using a train/test split of 75%/25%, we achieved an
accuracy rate of 0.99 on the test split for the BACH dataset and 0.96 on that
of the extension. On the test set of the BACH challenge, we reached an accuracy
of 0.81, which ranked us 8th out of 51 teams.
Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks
Learning deeper convolutional neural networks has become a trend in recent
years. However, much empirical evidence suggests that performance improvement
cannot be gained by simply stacking more layers. In this paper, we consider the
issue from an information theoretical perspective, and propose a novel method,
Relay Backpropagation, that encourages the propagation of effective information
through the network in the training stage. By virtue of the method, we achieved the
first place in ILSVRC 2015 Scene Classification Challenge. Extensive
experiments on two challenging large scale datasets demonstrate the
effectiveness of our method is not restricted to a specific dataset or network
architecture. Our models will be available to the research community later.
Comment: Technical report for our submissions to the ILSVRC 2015 Scene
Classification Challenge, where we won the first plac
Memory Aware Synapses: Learning what (not) to forget
Humans can learn in a continuous manner. Old rarely utilized knowledge can be
overwritten by new incoming information while important, frequently used
knowledge is prevented from being erased. In artificial learning systems,
lifelong learning so far has focused mainly on accumulating knowledge over
tasks and overcoming catastrophic forgetting. In this paper, we argue that,
given the limited model capacity and the unlimited new information to be
learned, knowledge has to be preserved or erased selectively. Inspired by
neuroplasticity, we propose a novel approach for lifelong learning, coined
Memory Aware Synapses (MAS). It computes the importance of the parameters of a
neural network in an unsupervised and online manner. Given a new sample which
is fed to the network, MAS accumulates an importance measure for each parameter
of the network, based on how sensitive the predicted output function is to a
change in this parameter. When learning a new task, changes to important
parameters can then be penalized, effectively preventing important knowledge
related to previous tasks from being overwritten. Further, we show an
interesting connection between a local version of our method and Hebb's
rule, which is a model for the learning process in the brain. We test our method
on a sequence of object recognition tasks and on the challenging problem of
learning an embedding for predicting triplets.
We show state-of-the-art performance and, for the first time, the ability to
adapt the importance of the parameters based on unlabeled data towards what the
network needs (not) to forget, which may vary depending on test conditions.
Comment: ECCV 201
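The importance measure described in this abstract can be sketched as the average absolute sensitivity of the squared output norm to each parameter, computed on unlabeled inputs. This is a minimal sketch under stated assumptions: the tiny linear model, the finite-difference gradient, the sample inputs and the `strength` parameter are all illustrative choices, not the authors' implementation.

```python
def model_output(params, x):
    """Tiny stand-in model: two outputs, each a weighted sum of a 2-d input."""
    w00, w01, w10, w11 = params
    return [w00 * x[0] + w01 * x[1], w10 * x[0] + w11 * x[1]]


def squared_norm(output):
    return sum(o * o for o in output)


def importance(params, samples, eps=1e-5):
    """Mean |d ||F(x)||^2 / d theta_i| over unlabeled samples (central differences)."""
    totals = [0.0] * len(params)
    for x in samples:
        for i in range(len(params)):
            up, down = list(params), list(params)
            up[i] += eps
            down[i] -= eps
            grad = (squared_norm(model_output(up, x))
                    - squared_norm(model_output(down, x))) / (2 * eps)
            totals[i] += abs(grad)
    return [t / len(samples) for t in totals]


def mas_penalty(params, old_params, omega, strength=1.0):
    """Regularizer that makes changes to important parameters expensive."""
    return strength * sum(o * (p - q) ** 2
                          for o, p, q in zip(omega, params, old_params))


old_params = [0.5, 0.3, -0.2, 0.1]
samples = [[1.0, 0.0], [2.0, 0.0]]  # second input feature is never active
omega = importance(old_params, samples)
print(omega)
print(mas_penalty([0.6, 0.3, -0.2, 0.1], old_params, omega))
```

Parameters attached to the unused input feature receive near-zero importance, so a new task can overwrite them at almost no cost, while the parameters the outputs depend on are protected.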
Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes
We propose a novel method for neural network quantization that casts the
neural architecture search problem as one of hyperparameter search to find
non-uniform bit distributions throughout the layers of a CNN. We perform the
search assuming a Multi-Task Gaussian Process prior, which splits the problem
into multiple tasks, each corresponding to a different number of training epochs,
and explore the space by sampling those configurations that yield maximum
information. We then show that with significantly lower precision in the last
layers we achieve a minimal loss of accuracy with appreciable memory savings.
We test our findings on the CIFAR10 and ImageNet datasets using the VGG, ResNet
and GoogLeNet architectures.
Comment: Accepted for publication at ECCV 2020. Code available at
https://code.active.vision . Updated for typ
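The memory effect of assigning different bit widths to different layers can be illustrated with a small sketch. The quantizer here is a simple min-max uniform one, which is an assumption since the abstract does not specify it; the Gaussian Process search itself is not shown, and the layer sizes and bit widths are invented.

```python
def quantize(weights, bits):
    """Snap weights to 2**bits evenly spaced levels between their min and max."""
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)
    step = (hi - lo) / levels
    return [lo + round((w - lo) / step) * step for w in weights]


def memory_bits(layer_sizes, bit_widths):
    """Total weight storage when each layer uses its own bit width."""
    return sum(n * b for n, b in zip(layer_sizes, bit_widths))


# Hypothetical two-layer network: a small early layer and a large last layer.
layer_sizes = [1000, 5000]
uniform = memory_bits(layer_sizes, [8, 8])
non_uniform = memory_bits(layer_sizes, [8, 2])  # lower precision in the last layer
print(uniform, non_uniform)
print(quantize([-1.0, -0.2, 0.3, 1.0], bits=1))
```

Lowering the precision of the largest layers gives most of the memory savings, which is why a per-layer bit allocation is worth searching for.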